svcR: An R Package for Support Vector Clustering improved with Geometric Hashing applied to Lexical Pattern Discovery

Author

  • Nicolas Turenne
Abstract

We present a new R package that takes a numerical matrix as input and computes clusters using support vector clustering (SVC). We have implemented an original 2D-grid labeling approach to speed up cluster extraction. In this sense, SVC can be seen as an efficient cluster extraction method when clusters are separable in a 2-D map. Second, we show that this SVC approach, using a Jaccard-radial basis kernel, classifies a set of terms into ontological classes reasonably well and helps define regular-expression rules for information extraction from documents; our case study concerns a set of terms and documents about developmental and molecular biology.
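
The abstract does not spell out how the 2D-grid labeling works, so the following is only a minimal Python sketch of one plausible reading, not the svcR implementation. It assumes that the 2-D map has already been discretized into cells and that a boolean mask `inside` marks the cells falling within an SVC cluster contour; the names `label_grid`, `assign_points`, `inside`, `xmin`, `ymin`, and `cell` are all hypothetical. Connected groups of marked cells receive one label each, and every point then takes the label of the cell it falls in.

```python
from collections import deque


def label_grid(inside):
    """Label 4-connected components of True cells in a 2D boolean grid.

    `inside` is an assumed input: a list of rows, True where the cell lies
    within a cluster contour. Returns a grid of ints (0 = outside).
    """
    rows, cols = len(inside), len(inside[0])
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if not inside[r][c] or labels[r][c] != 0:
                continue
            # Start a new component and flood-fill it breadth-first.
            current += 1
            labels[r][c] = current
            queue = deque([(r, c)])
            while queue:
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and inside[ni][nj] and labels[ni][nj] == 0):
                        labels[ni][nj] = current
                        queue.append((ni, nj))
    return labels


def assign_points(points, labels, xmin, ymin, cell):
    """Give each (x, y) point the label of the grid cell containing it."""
    rows, cols = len(labels), len(labels[0])
    result = []
    for x, y in points:
        r = int((y - ymin) // cell)
        c = int((x - xmin) // cell)
        # Points that fall outside the grid are treated as unlabeled (0).
        in_grid = 0 <= r < rows and 0 <= c < cols
        result.append(labels[r][c] if in_grid else 0)
    return result


# Toy mask with two separate regions of marked cells.
inside = [
    [True, True, False, False],
    [False, False, False, True],
    [False, False, False, True],
]
grid_labels = label_grid(inside)
point_labels = assign_points(
    [(0.5, 0.5), (3.5, 2.5), (2.5, 0.5)], grid_labels, 0.0, 0.0, 1.0
)
```

Under these assumptions, labeling costs time proportional to the number of grid cells plus the number of points, rather than requiring a pairwise connectivity test between points, which is one way a grid could speed up cluster extraction.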

Similar resources

svcR: a package for Support Vector Clustering improved with Geometric Hashing. Application to Lexical Pattern Discovery

We developed an R toolkit for data described by attributes that builds clusters with a support vector clustering (SVC) method. We have implemented an original 2D-grid labeling approach to extract clusters and reduce processing time. In this sense, SVC can be seen as an efficient cluster extraction method if clusters are separable in a 2-D map. Secondly we showed that this SVC approach using a...

A Regularized Nonsmooth Newton Method for Multi-class Support Vector Machines

Multi-class classification is an important and on-going research subject in machine learning. Recently, the ν-K-SVCR method was proposed by the authors for multi-class classification. Since many optimization problems have to be solved in multi-class classification, it is extremely important to develop an algorithm that can solve those optimization problems efficiently. In this paper, the optimi...

Detection and Classification of Breast Cancer in Mammography Images Using Pattern Recognition Methods

Introduction: In this paper, a method is presented to classify the breast cancer masses according to new geometric features. Methods: After obtaining digital breast mammogram images from the digital database for screening mammography (DDSM), image preprocessing was performed. Then, by using image processing methods, an algorithm was developed for automatic extracting of masses from other norma...

Journal:
  • CoRR

Volume abs/1504.06080

Publication date: 2010